The NYC restaurant inspections dataset has 397,584 observations and 18 variables. Observations with no dba (the "doing business as" name) were filtered out of later analysis. Some observations have an inspection date of 1900-01-01, which is clearly erroneous; these were also excluded from the subsequent analysis of critical violations by establishment over the years.
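The two exclusions described above can be sketched as follows. This is a minimal illustration on a toy tibble, not the report's own code: `toy_inspec`, its rows, and `placeholder_date` are hypothetical, and only the column names `dba` and `inspection_date` come from the dataset.

```r
library(dplyr)

# Hypothetical stand-in for rest_inspec with the two relevant columns
toy_inspec <- tibble(
  dba = c("CAFE A", NA, "DINER B", "CAFE A"),
  inspection_date = as.POSIXct(
    c("2016-02-03", "2015-03-17", "1900-01-01", "2017-10-17"),
    tz = "UTC"
  )
)

# The erroneous date value mentioned in the text
placeholder_date <- as.POSIXct("1900-01-01", tz = "UTC")

# Keep rows that have a business name and a plausible inspection date
clean_inspec <- toy_inspec |>
  filter(!is.na(dba), inspection_date != placeholder_date)

nrow(clean_inspec)
```

Applying the same `filter()` to `rest_inspec` before any grouping would make both exclusions explicit rather than relying on later filters to remove those rows.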

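# Assumed setup: the original does not show which packages were attached
library(tidyverse)
library(plotly)
library(p8105.datasets)  # assumed source of rest_inspec

data("rest_inspec")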

summary(rest_inspec)
##     action              boro             building             camis         
##  Length:397584      Length:397584      Length:397584      Min.   :30075445  
##  Class :character   Class :character   Class :character   1st Qu.:41227319  
##  Mode  :character   Mode  :character   Mode  :character   Median :41622444  
##                                                           Mean   :44534756  
##                                                           3rd Qu.:50011150  
##                                                           Max.   :50071063  
##                                                                             
##  critical_flag      cuisine_description     dba           
##  Length:397584      Length:397584       Length:397584     
##  Class :character   Class :character    Class :character  
##  Mode  :character   Mode  :character    Mode  :character  
##                                                           
##                                                           
##                                                           
##                                                           
##  inspection_date                  inspection_type       phone          
##  Min.   :1900-01-01 00:00:00.00   Length:397584      Length:397584     
##  1st Qu.:2015-03-17 00:00:00.00   Class :character   Class :character  
##  Median :2016-02-03 00:00:00.00   Mode  :character   Mode  :character  
##  Mean   :2015-09-27 22:03:03.41                                        
##  3rd Qu.:2016-12-13 00:00:00.00                                        
##  Max.   :2017-10-17 00:00:00.00                                        
##                                                                        
##   record_date                         score           street         
##  Min.   :2017-10-19 06:00:49.00   Min.   : -2.00   Length:397584     
##  1st Qu.:2017-10-19 06:00:49.00   1st Qu.: 11.00   Class :character  
##  Median :2017-10-19 06:00:49.00   Median : 15.00   Mode  :character  
##  Mean   :2017-10-19 06:00:49.06   Mean   : 18.93                     
##  3rd Qu.:2017-10-19 06:00:49.00   3rd Qu.: 24.00                     
##  Max.   :2017-10-19 06:00:59.00   Max.   :151.00                     
##                                   NA's   :22642                      
##  violation_code     violation_description    zipcode         grade          
##  Length:397584      Length:397584         Min.   :10001   Length:397584     
##  Class :character   Class :character      1st Qu.:10022   Class :character  
##  Mode  :character   Mode  :character      Median :10468   Mode  :character  
##                                           Mean   :10675                     
##                                           3rd Qu.:11229                     
##                                           Max.   :11697                     
##                                           NA's   :5                         
##    grade_date                    
##  Min.   :2012-05-01 00:00:00.00  
##  1st Qu.:2015-03-30 00:00:00.00  
##  Median :2016-02-17 00:00:00.00  
##  Mean   :2016-01-31 05:45:17.54  
##  3rd Qu.:2016-12-13 00:00:00.00  
##  Max.   :2017-10-17 00:00:00.00  
##  NA's   :204287
rest_inspec |> 
  filter(grade == "C" & critical_flag == "Critical") 
## # A tibble: 4,623 × 18
##    action          boro  building  camis critical_flag cuisine_description dba  
##    <chr>           <chr> <chr>     <int> <chr>         <chr>               <chr>
##  1 Violations wer… MANH… 365      4.14e7 Critical      Asian               ALPH…
##  2 Violations wer… MANH… 370      5.00e7 Critical      Asian               HOA …
##  3 Violations wer… MANH… 11       4.17e7 Critical      Korean              MARU 
##  4 Violations wer… MANH… 537      5.00e7 Critical      Café/Coffee/Tea     CORS…
##  5 Violations wer… MANH… 35       4.13e7 Critical      Korean              MADA…
##  6 Violations wer… MANH… 150      5.00e7 Critical      American            MADI…
##  7 Violations wer… MANH… 312      4.14e7 Critical      American            CAFE…
##  8 Violations wer… MANH… 249      5.01e7 Critical      American            PARS…
##  9 Violations wer… MANH… 229      5.00e7 Critical      Chinese             GRAN…
## 10 Violations wer… MANH… 0        4.06e7 Critical      American            KABO…
## # ℹ 4,613 more rows
## # ℹ 11 more variables: inspection_date <dttm>, inspection_type <chr>,
## #   phone <chr>, record_date <dttm>, score <int>, street <chr>,
## #   violation_code <chr>, violation_description <chr>, zipcode <int>,
## #   grade <chr>, grade_date <dttm>
rest_inspec |>
  filter(!(boro == "Missing")) |> 
  mutate(
    boro = factor(boro),
    boro = fct_relevel(boro, c("STATEN ISLAND", "BRONX", "QUEENS", "MANHATTAN", "BROOKLYN"))
  ) |> 
  plot_ly(x = ~boro, y = ~score, color = ~boro, type = "violin", colors = "viridis") 
## Warning: Ignoring 22636 observations

This code chunk counts the critical reports for each business and sorts the counts in descending order to show the most frequently reported businesses. Dunkin' Donuts and Dunkin' Donuts, Baskin Robbins appear as separate entries, so the latter is recoded to Dunkin' Donuts. head() is then used to extract the 20 worst offenders.

rest_inspec |> 
  filter(critical_flag == "Critical") |> 
  mutate(
    dba = case_match(
      dba,
      "DUNKIN' DONUTS, BASKIN ROBBINS" ~ "DUNKIN' DONUTS",
      .default = dba
    )
  ) |>  
  summary()
##     action              boro             building             camis         
##  Length:218913      Length:218913      Length:218913      Min.   :30075445  
##  Class :character   Class :character   Class :character   1st Qu.:41232846  
##  Mode  :character   Mode  :character   Mode  :character   Median :41624158  
##                                                           Mean   :44549022  
##                                                           3rd Qu.:50011159  
##                                                           Max.   :50070808  
##                                                                             
##  critical_flag      cuisine_description     dba           
##  Length:218913      Length:218913       Length:218913     
##  Class :character   Class :character    Class :character  
##  Mode  :character   Mode  :character    Mode  :character  
##                                                           
##                                                           
##                                                           
##                                                           
##  inspection_date                  inspection_type       phone          
##  Min.   :2012-05-01 00:00:00.00   Length:218913      Length:218913     
##  1st Qu.:2015-03-25 00:00:00.00   Class :character   Class :character  
##  Median :2016-02-10 00:00:00.00   Mode  :character   Mode  :character  
##  Mean   :2016-01-29 17:45:27.47                                        
##  3rd Qu.:2016-12-20 00:00:00.00                                        
##  Max.   :2017-10-17 00:00:00.00                                        
##                                                                        
##   record_date                      score           street         
##  Min.   :2017-10-19 06:00:49   Min.   : -2.00   Length:218913     
##  1st Qu.:2017-10-19 06:00:49   1st Qu.: 12.00   Class :character  
##  Median :2017-10-19 06:00:49   Median : 17.00   Mode  :character  
##  Mean   :2017-10-19 06:00:49   Mean   : 20.76                     
##  3rd Qu.:2017-10-19 06:00:49   3rd Qu.: 26.00                     
##  Max.   :2017-10-19 06:00:49   Max.   :151.00                     
##                                                                   
##  violation_code     violation_description    zipcode         grade          
##  Length:218913      Length:218913         Min.   :10001   Length:218913     
##  Class :character   Class :character      1st Qu.:10022   Class :character  
##  Mode  :character   Mode  :character      Median :10467   Mode  :character  
##                                           Mean   :10672                     
##                                           3rd Qu.:11229                     
##                                           Max.   :11697                     
##                                                                             
##    grade_date                    
##  Min.   :2012-05-01 00:00:00.00  
##  1st Qu.:2015-03-31 00:00:00.00  
##  Median :2016-02-17 00:00:00.00  
##  Mean   :2016-01-30 12:45:31.42  
##  3rd Qu.:2016-12-09 00:00:00.00  
##  Max.   :2017-10-17 00:00:00.00  
##  NA's   :118118
rest_inspec |> 
  filter(critical_flag == "Critical") |> 
  mutate(
    dba = case_match(
      dba,
      "DUNKIN' DONUTS, BASKIN ROBBINS" ~ "DUNKIN' DONUTS",
      .default = dba
    )
  ) |> 
  group_by(dba) |> 
  summarize(critical_reports = n()) |> 
  arrange(desc(critical_reports)) |> 
  head(n = 20) |> 
  mutate(
    dba = factor(dba),
    dba = fct_reorder(dba, critical_reports)
  ) |> 
  plot_ly(x = ~dba, y = ~critical_reports, color = ~dba, type = "bar", colors = "viridis") 
rest_inspec |> 
  filter(critical_flag == "Critical") |> 
  mutate(
    dba = case_match(
      dba,
      "DUNKIN' DONUTS, BASKIN ROBBINS" ~ "DUNKIN' DONUTS",
      .default = dba
    ), 
    year = year(inspection_date)
  ) |> 
  group_by(dba, year) |> 
  summarize(critical_reports = n(), .groups = "drop") |>  # ungroup so later steps see the whole table
  arrange(desc(critical_reports), year) |> 
  head(n = 1000) |> 
  mutate(
    dba = factor(dba),
    dba = fct_reorder(dba, critical_reports)
  ) |> 
  plot_ly(x = ~dba, y = ~critical_reports, color = ~dba, type = "bar", colors = "viridis")
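The recode-count-rank pattern used in the chunks above can be checked on a small example. The sketch below is an alternative phrasing rather than the report's code: it swaps `group_by()` + `summarize()` + `arrange()` + `head()` for `count(sort = TRUE)` + `slice_head()`, and the rows of `toy_critical` (including the "SUBWAY" entries) are made up for illustration. `case_match()` assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical critical-violation rows, one per report
toy_critical <- tibble(
  dba = c(
    "DUNKIN' DONUTS", "DUNKIN' DONUTS, BASKIN ROBBINS", "SUBWAY",
    "DUNKIN' DONUTS", "SUBWAY", "MARU"
  )
)

top_offenders <- toy_critical |>
  # Fold the combined-brand entry into the main brand
  mutate(
    dba = case_match(
      dba,
      "DUNKIN' DONUTS, BASKIN ROBBINS" ~ "DUNKIN' DONUTS",
      .default = dba
    )
  ) |>
  # One row per business, largest count first
  count(dba, name = "critical_reports", sort = TRUE) |>
  # Keep the top 2 here; the report keeps the top 20
  slice_head(n = 2)

top_offenders
```

Because `count()` returns an ungrouped tibble, the later `fct_reorder()` step can be applied without any extra ungrouping.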

This code chunk selects, from the observations with critical reports, those where the establishment was closed (or re-closed).

rest_inspec |> 
  filter(critical_flag == "Critical" & str_detect(action, "[Cc]losed"))  |> 
  summary()
##     action              boro             building             camis         
##  Length:7013        Length:7013        Length:7013        Min.   :40364179  
##  Class :character   Class :character   Class :character   1st Qu.:41353598  
##  Mode  :character   Mode  :character   Mode  :character   Median :41719174  
##                                                           Mean   :45622097  
##                                                           3rd Qu.:50035038  
##                                                           Max.   :50070321  
##                                                                             
##  critical_flag      cuisine_description     dba           
##  Length:7013        Length:7013         Length:7013       
##  Class :character   Class :character    Class :character  
##  Mode  :character   Mode  :character    Mode  :character  
##                                                           
##                                                           
##                                                           
##                                                           
##  inspection_date                  inspection_type       phone          
##  Min.   :2013-03-08 00:00:00.00   Length:7013        Length:7013       
##  1st Qu.:2015-05-28 00:00:00.00   Class :character   Class :character  
##  Median :2016-07-25 00:00:00.00   Mode  :character   Mode  :character  
##  Mean   :2016-05-01 03:51:49.25                                        
##  3rd Qu.:2017-06-05 00:00:00.00                                        
##  Max.   :2017-10-17 00:00:00.00                                        
##                                                                        
##   record_date                      score           street         
##  Min.   :2017-10-19 06:00:49   Min.   :  0.00   Length:7013       
##  1st Qu.:2017-10-19 06:00:49   1st Qu.: 41.00   Class :character  
##  Median :2017-10-19 06:00:49   Median : 51.00   Mode  :character  
##  Mean   :2017-10-19 06:00:49   Mean   : 53.57                     
##  3rd Qu.:2017-10-19 06:00:49   3rd Qu.: 65.00                     
##  Max.   :2017-10-19 06:00:49   Max.   :151.00                     
##                                                                   
##  violation_code     violation_description    zipcode         grade          
##  Length:7013        Length:7013           Min.   :10001   Length:7013       
##  Class :character   Class :character      1st Qu.:10027   Class :character  
##  Mode  :character   Mode  :character      Median :11106   Mode  :character  
##                                           Mean   :10737                     
##                                           3rd Qu.:11232                     
##                                           Max.   :11694                     
##                                                                             
##    grade_date  
##  Min.   :NA    
##  1st Qu.:NA    
##  Median :NA    
##  Mean   :NaN   
##  3rd Qu.:NA    
##  Max.   :NA    
##  NA's   :7013
rest_inspec |> 
  filter(critical_flag == "Critical" & str_detect(action, "[Cc]losed")) |> 
  group_by(cuisine_description) |> 
  summarize(closed = n()) |> 
  arrange(desc(closed)) |> 
  mutate(
    cuisine_description = factor(cuisine_description),
    cuisine_description = fct_reorder(cuisine_description, closed)
  ) |> 
  plot_ly(
    x = ~cuisine_description, y = ~closed, color = ~cuisine_description,
    type = "bar", colors = "viridis"
  )
rest_inspec |> 
  filter(critical_flag == "Critical" & str_detect(action, "[Cc]losed"))  |> 
  plot_ly()
## Warning: No trace type specified and no positional attributes specified
## No trace type specified:
##   Based on info supplied, a 'scatter' trace seems appropriate.
##   Read more about this trace type -> https://plotly.com/r/reference/#scatter
## No scatter mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode
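To see what the `[Cc]losed` pattern in the filters above does and does not match, it can be run against a few action strings. The strings in `actions` are illustrative examples written for this sketch and are assumed, not copied, to resemble the values in the `action` column.

```r
library(stringr)

# Illustrative action values (assumed wording, not taken from the data)
actions <- c(
  "Establishment Closed by DOHMH.",
  "Establishment re-closed by DOHMH",
  "Establishment re-opened by DOHMH",
  "Violations were cited in the following area(s)."
)

# TRUE where the text contains "Closed" or "closed" anywhere,
# which also catches "re-closed" but not "re-opened"
is_closed <- str_detect(actions, "[Cc]losed")

is_closed
```

A character class is used instead of `regex("closed", ignore_case = TRUE)`; either approach would work for this check, and the character class keeps the pattern identical to the one in the report.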

Chart